
Hadoop Developer - San Jose, CA

Job Description

  • Lead the architecture, design, and development of components and services to enable Machine Learning at scale.
  • Responsible for data ingestion from disparate data sources, including SAP and non-SAP sources such as iEnergy and MDMS, and for maintaining and extending the big data platform infrastructure that supports the client’s business use cases.
  • Identify and recommend the most appropriate paradigms and technology choices for batch and real-time scenarios.
  • Work on cluster-level solutions for our complex systems and develop enterprise-level applications, followed by unit testing.
  • Build pipelines from source to SFTP, and from SFTP to the Hadoop landing layer, using Talend.
  • Develop an automated data ingestion framework using Talend to synchronize Hadoop data with SAP HANA and vice versa.
  • Run complex queries and work with bucketing, partitioning, joins, and sub-queries.
  • Write advanced big data business application code using both functional and object-oriented programming.
  • Implement complex transformations and actions using DataFrames and Datasets in Spark/Scala.
  • Develop standalone Spark/Scala applications that read error logs from multiple upstream data sources and run validations on them.
  • Write build scripts using tools such as Apache Maven, Ant, and sbt, and deploy the code using Jenkins for CI/CD.
  • Write complex workflow jobs using Redwood and set up a multi-program scheduling system to manage Hadoop, Hive, Sqoop, and Spark jobs.
  • Closely monitor pipeline jobs and troubleshoot failed jobs. Set up new property configurations within Redwood SC.
  • Develop Kafka producers that handle multiple streaming data feeds within a specified duration.
  • Teach and mentor other engineers on the team.
  • Document the functional and technical requirements by following company defined processes and methodologies.
  • Perform data cleanup and validation on streaming data using Spark, Spark Streaming, and Scala.

Required Skills:

  • A minimum of a bachelor's degree in computer science or equivalent.
  • Cloudera Hadoop (CDH), Cloudera Manager, Informatica Big Data Edition (BDM), HDFS, YARN, MapReduce, Hive, Impala, Kudu, Sqoop, Spark, Kafka, HBase, Teradata Studio Express, Teradata, Tableau, Kerberos, Active Directory, Sentry, TLS/SSL, Linux/RHEL, Unix, Windows, SBT, Maven, Jenkins, Oracle, MS SQL Server, Shell Scripting, Eclipse IDE, Git, SVN
  • Must have strong problem-solving and analytical skills.
  • Must have the ability to identify complex problems and review related information to develop and evaluate options and implement solutions.

If you are interested in working in a fast-paced, challenging, fun, entrepreneurial environment and would like the opportunity to be part of this fascinating industry, send your resume to HSTechnologies LLC, 2801 W Parker Road, Suite #5, Plano, TX 75023, or email it to hr@sbhstech.com.